New Readability Measures for Bangla and Hindi Texts
Abstract
In this paper, we present computational models for measuring the readability of text documents in Indian languages. We first demonstrate that some of the popular readability metrics for English are inadequate for, and therefore inapplicable to, Hindi and Bangla. Next, we present user experiments that identify important structural parameters of Bangla and Hindi affecting the readability of texts in these two languages. Accordingly, we propose two different readability models for each of Bangla and Hindi. The models are tested against a second round of user studies with a completely new set of data. The results validate the proposed models. Compared to the handful of existing works on Hindi and Bangla text readability, this paper presents the first definitive readability models for these languages that incorporate their salient structural features.
Similar resources
Influence of Target Reader Background and Text Features on Text Readability in Bangla: A Computational Approach
In this paper, we have studied the effect of two important factors influencing text readability in Bangla: the target reader and text properties. Accordingly, at first we have built a novel Bangla readability dataset of 135 documents annotated by 50 readers from two different backgrounds. We have identified 20 different features that can affect the readability of Bangla texts; the features were...
Bidirectional Dependency Parser for Hindi, Telugu and Bangla
This paper describes the dependency parser we used in the NLP Tools Contest, 2009 for parsing Hindi, Bangla and Telugu. The parser uses a bidirectional parsing algorithm with two operations proj and non-proj to build the dependency tree. The parser obtained Labeled Attachment Score of 71.63%, 59.86% and 67.74% for Hindi, Telugu and Bangla respectively on the treebank with fine-grained dependenc...
ICON-2008: 6th International Conference on Natural Language Processing
A system of machine translation under the framework of transfer-based grammar for Indian languages needs a set of rules for mapping the several syntactic as well as semantic facts of a source language on to the target language representations. Among these critical syntactico-semantic facts, this paper tries to approximate linguistic conditions for mapping rules for Hindi postposition transfer t...
Cohesive Readability of Expository Texts and Reading Comprehension Performance: Iranian EFL students of Different Proficiency Levels in Focus
The present study is an attempt to investigate the relationship between cohesive readability of expository texts and reading comprehension in EFL students with different proficiency levels. One hundred students participated in this study. They were undergraduate students majoring in English at University of Isfahan. To collect the relevant data, participants were divide...
Qualitative and Quantitative Examination of Text Type Readabilities: A Comparative Analysis
This study compared 2 main approaches to readability assessment. The quantitative approach applied idea density based on part of speech tagging and compared 3 sets of text types (i.e., narrative, expository, and argumentative) with respect to their ease of reading. The qualitative approach was done through developing questionnaires measuring intermediate EFL learners’ perceptions on content, motivat...